NOSQL
mid-80s: rise of relational databases: persistence, SQL, transactions, reporting
SQL problem: one logical entity has to be assembled from lots of tables (impedance mismatch)
mid-90s: rise of object databases, argued about endlessly
integration database ==> relational dominance
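The impedance mismatch noted above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names, columns and sample rows are made up for the example, not taken from the talk. One in-memory order has to be rebuilt from rows in two tables.

```python
import sqlite3

# In-memory database with a hypothetical two-table layout for one order.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE order_lines (order_id INTEGER, product TEXT, price REAL, quantity INTEGER);
    INSERT INTO orders VALUES (1, 7231);
    INSERT INTO order_lines VALUES (1, 'book', 12.5, 2);
    INSERT INTO order_lines VALUES (1, 'pen', 1.5, 4);
""")

def load_order(conn, order_id):
    """Rebuild one logical order from rows spread over two tables."""
    row = conn.execute(
        "SELECT id, customer_id FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    lines = conn.execute(
        "SELECT product, price, quantity FROM order_lines "
        "WHERE order_id = ? ORDER BY product",
        (order_id,),
    ).fetchall()
    return {
        "id": row[0],
        "customer_id": row[1],
        "lines": [
            {"product": p, "price": price, "quantity": qty}
            for p, price, qty in lines
        ],
    }

order = load_order(conn, 1)
```

The application works with `order` as one nested structure, while the database only knows flat rows; the mapping code in between is the mismatch.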
web sites: lots of traffic, two options:
1 scale up: you buy a bigger box
2 scale out: lots of small boxes, but SQL does not work well across a cluster of boxes.
so,
Google --> Bigtable
Amazon --> Dynamo
"NOSQL"
MongoDB, CouchDB, Cassandra, Hypertable, Dynomite, HBase
characteristics of nosql
non-relational, cluster-friendly, schema-less, open-source, 21st century web
data model vs relational model
Column-family | document | key-value         | graph |
cassandra     | mongodb  | redis             | neo4j |
hbase         | couchdb  | riak              |       |
              | ravendb  | project voldemort |       |
- graph
- aggregate-oriented (aggregate as in domain-driven design, Eric Evans)
  - key-value: a hashmap persisted on disk, e.g. key customer_id: 7231, optional metadata, value = the aggregate (opaque to the store)
  - document: no schema, JSON-like, transparent (the store can see inside the aggregate)
    anOrder["price"] * anOrder["quantity"]: implicit schema, the code assumes "price" and "quantity" exist, so "schema-less", uhm, not quite; document = aggregate
  - column-family
    row key + column family name + column key + column value
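A minimal sketch of the three aggregate-oriented models above, using plain Python dicts as stand-ins for real stores. The field values, the "profile" and "orders" family names and the helper functions are hypothetical, chosen only to mirror the notes.

```python
import json

# Hypothetical order aggregate; 7231 is the customer_id used in the notes.
an_order = {"customer_id": 7231, "price": 12.5, "quantity": 2}

# Key-value: the store only sees key -> opaque value (serialized bytes here).
kv_store = {}
kv_store["7231"] = json.dumps(an_order).encode("utf-8")
restored = json.loads(kv_store["7231"].decode("utf-8"))

# Document: the aggregate is visible, but its schema lives in the code.
def order_total(doc):
    # Assumes "price" and "quantity" exist: this is the implicit schema.
    return doc["price"] * doc["quantity"]

# Column-family: row key -> column family name -> column key -> column value.
column_store = {
    "7231": {
        "profile": {"name": "Ann"},
        "orders": {"order-1": json.dumps({"price": 12.5, "quantity": 2})},
    }
}

def get_cell(store, row_key, family, column_key):
    """Look up one column value by its full address."""
    return store[row_key][family][column_key]
```

Calling `order_total({"price": 3})` raises `KeyError`, which is the implicit schema showing up at read time instead of at write time.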
however, nothing is perfect.
rdbms == acid
nosql == base
graph : acid
you can't afford to hold a transaction open that long
version stamp!!! works as an offline lock when an update crosses a transaction boundary
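One way to picture the version stamp is a conditional write: the writer passes back the version it read, and the store rejects the update if the version has moved on. The class and method names below are invented for this sketch and do not correspond to any particular database API.

```python
class StaleVersionError(Exception):
    """Raised when a write is based on an out-of-date read."""

class VersionedStore:
    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        # Unknown keys read as version 0 with no value.
        return self._data.get(key, (0, None))

    def write(self, key, new_value, expected_version):
        current_version, _ = self._data.get(key, (0, None))
        if current_version != expected_version:
            # Someone else wrote since this client read: refuse the update.
            raise StaleVersionError(
                f"{key}: expected v{expected_version}, found v{current_version}"
            )
        self._data[key] = (current_version + 1, new_value)
        return current_version + 1

store = VersionedStore()
store.write("cart", ["book"], expected_version=0)

# Two clients read the same version, then both try to write.
version_seen, _ = store.read("cart")
store.write("cart", ["book", "pen"], expected_version=version_seen)
try:
    store.write("cart", ["book", "ink"], expected_version=version_seen)
    second_write_rejected = False
except StaleVersionError:
    second_write_rejected = True
```

No lock is held while the user thinks; the conflict is detected only at write time, and the losing client has to re-read and retry.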
consistency :
- logical
- replication
CAP:
Partition + Consistency
Partition + Availability
Consistency <==spectrum==> Availability
Consistency <== ==> Response time
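The choice under a partition can be sketched as a toy two-replica booking system. Everything here (the `Replica` and `Cluster` classes, the "CP"/"AP" mode flag, the room numbers) is hypothetical and far simpler than a real cluster; it only shows the trade-off.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.rooms_booked = set()

class Cluster:
    """Two replicas that normally agree on every booking."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False  # True when a and b cannot talk

    def book(self, replica, room):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: refuse rather than risk a conflict.
                return False
            # Availability first: answer from local state only.
            if room in replica.rooms_booked:
                return False
            replica.rooms_booked.add(room)
            return True
        # Healthy network: check and update both replicas together.
        if room in replica.rooms_booked or room in other.rooms_booked:
            return False
        replica.rooms_booked.add(room)
        other.rooms_booked.add(room)
        return True

cp = Cluster("CP")
cp.partitioned = True
cp_result = cp.book(cp.a, 101)  # refused: unavailable but consistent

ap = Cluster("AP")
ap.partitioned = True
ap_results = (ap.book(ap.a, 101), ap.book(ap.b, 101))  # both accepted
```

In AP mode room 101 ends up booked on both sides of the partition, so the business has to be able to resolve the double booking afterwards; in CP mode no booking is taken until the partition heals.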
driver to nosql:
large scale data , easier development
Integration database ===> application database (behind a web service)
each application controls access to its own data ==> NoSQL fits, no integration through a shared database!!!
for data warehouse
agile analytics
Polyglot persistence: relational still plays a role, choose the appropriate db for your biz.
rapid response, data intensive